ABSTRACT

We analyze the three-year SDSS-II Supernova (SN) Survey data and identify a sample of 1070
photometric SN Ia candidates based on their multi-band light curve data. This sample consists of SN
candidates with no spectroscopic confirmation, with a subset of 210 candidates having spectroscopic
redshifts of their host galaxies measured, while the remaining 860 candidates are purely photometric
in their identification. We describe a method for estimating the efficiency and purity of photometric
SN Ia classification when spectroscopic confirmation of only a limited sample is available, and
demonstrate that SN Ia candidates from SDSS-II can be identified photometrically with ~ 91% efficiency
and with a contamination of ~ 6%. Although this is the largest uniform sample of SN candidates to date
for studying photometric identification, we find that a larger spectroscopic sample of contaminating
sources is required to obtain a better characterization of the background events. A Hubble diagram
using SN candidates with no spectroscopic confirmation, but with host galaxy spectroscopic redshifts,
yields a distance modulus dispersion that is only ~ 20-40% larger than that of the
spectroscopically-confirmed SN Ia sample alone, with no significant bias. A Hubble diagram with purely
photometric classification and redshift-distance measurements, however, exhibits biases that require
further investigation for precision cosmology.
Subject headings: cosmology: observations — supernovae: general — surveys 



1. INTRODUCTION 

Measurements of luminosity distances to nearby Type Ia supernovae (SN Ia)
(Phillips 1993; Hamuy et al. 1996a) and their distant counterparts have played a
central role in modern cosmology and the remarkable discovery of an accelerating
universe (Riess et al. 1998; Perlmutter et al. 1999). Many dedicated supernova (SN)
surveys and follow-up programs have since then acquired light curves and spectra for
several thousands of SN in various redshift ranges: 1) at z < 0.1 by the Lick
Observatory Supernova Search (Filippenko et al. 2001; Ganeshalingam et al. 2010), the
CfA monitoring campaign (Riess et al. 1999; Jha et al. 2006a; Matheson et al. 2008;
Hicken et al. 2009), SNFactory (Bailey et al. 2003), Carnegie Supernova Project
Low-z Program (Contreras et al. 2009; Folatelli et al. 2010), the Palomar Transient
Factory (Rau et al. 2009; Law et al. 2009), and the Panoramic Survey Telescope and
Rapid Response System (Pan-STARRS^12); 2) the SDSS-II SN Survey in the intermediate
redshift interval 0.1 < z < 0.3 (Frieman et al. 2008; Sako et al. 2008); 3) the
highest-redshift range observable from the ground at 0.3 ≲ z ≲ 1 by the Supernova
Legacy Survey (SNLS; Astier et al. 2006; Guy et al. 2010; Conley et al. 2011), the
ESSENCE SN Survey (Miknaitis et al. 2007; Wood-Vasey et al. 2007), the Carnegie
Supernova Project High-z Program (Freedman et al. 2009); and finally 4) z > 1 SN Ia
from space using the Hubble Space Telescope (Riess et al. 2004a, 2007; Dawson et al.
2009).

Many future surveys, such as the Dark Energy Survey (DES; Flaugher et al. 2010) and
the Large Synoptic Survey Telescope (LSST; LSST Science Collaborations et al. 2009),
with deeper and more wide-field imaging capabilities will probe much larger volumes
of the universe, allowing discoveries of thousands to tens of thousands of
high-redshift SN candidates each year. Spectroscopic follow-up observations of these
large, faint SN samples will require prohibitively large time allocations with
existing instruments. Studies of SN properties and cosmology will, therefore,
necessitate a photometric determination of the SN type, cosmological redshift, and
the luminosity distance from light curves with possibly a limited subsample with
spectroscopic confirmation and redshift measurements.

^12 http://pan-starrs.ifa.hawaii.edu/public

Various methods for photometrically classifying SN have been discussed in the
literature. Optical and UV colors near maximum light, for example, have been used to
distinguish SN Ia from core-collapse SN (Pskovskii 1977; Poznanski et al. 2002;
Panagia 2003; Riess et al. 2004b; Johnson & Crotts 2006). Poznanski et al. (2007a)
have developed a Bayesian method that classifies SN using only a single epoch of
photometry (see also, Kuznetsova & Connolly 2007; Rodney & Tonry 2009).
Template-fitting methods have been employed for spectroscopic targeting of active SN
candidates (Sullivan et al. 2006; Sako et al. 2008). Sullivan et al. (2006) have
performed an analysis to identify a sample of photometric SN Ia candidates from the
first year of the Supernova Legacy Survey. Dahlen et al. (2004), Poznanski et al.
(2007b), Dahlen et al. (2008), Dilday et al. (2008), Dilday et al. (2010), Rodney &
Tonry (2010b), and Graur et al. (2011) have also used photometric classification to
measure SN rates as a function of redshift.

Although an efficient photometric SN classifier is cru- 
cial for a successful spectroscopic follow-up program and 
also for understanding the bias in the spectroscopic sam- 
ple, the ability to estimate both the efficiency and pu- 
rity of the selected sample is also important for under- 
standing, for example, possible biases in distance mea- 
surements and studies of SN rates. Clearly, the efficiency 
can be improved by compromising purity, and vice versa, 
and the requirements may vary depending on the type of 
study involved. 

In addition to photometrically identifying SN Ia candidates, redshifts as well as
luminosity distances can be inferred from the same multi-band light curve data. These
studies of SN cosmology without spectroscopy have been pioneered by Barris & Tonry
(2004) and carried out more recently by a number of authors. Palanque-Delabrouille et
al. (2010), Kessler et al. (2010a), and Rodney & Tonry (2010a), for example, study
the quality of photometric redshifts on large samples of existing data. Rodney &
Tonry (2010a) also construct a photometry-only Hubble diagram of the first-year
SDSS-II and SNLS spectroscopically-confirmed SN Ia using their Supernova Ontology
with Fuzzy Templates (SOFT) method. Others show comparisons of measured and input
redshifts primarily from simulations (Kim & Miquel 2007; Kunz et al. 2007; Wang et
al. 2007; Wang 2007; Gong et al. 2009; Scolnic et al. 2009).
The accuracy and precision of the measured param- 
eters depend on many observational factors including 
the statistical quality of the observed light curves, sur- 
face brightness of the underlying host galaxy, photomet- 
ric calibration, wavelength coverage, the number of fil- 
ter bandpasses, and the observing cadence. Other non- 
observational factors that might affect the measurements 
are the quality of the light curve models, assumptions on 
the dust properties and intrinsic SN colors, as well as priors
used in the fits. The photometric redshift uncertainty
on any individual SN is obviously larger than a typical
spectroscopic redshift error, but a substantially larger 
number of unbiased redshift and distance measurements 
made possible photometrically might be able to provide 
competitive constraints on cosmological parameters with 
future large-scale surveys. 

Some of the existing software packages and algorithms, including
the one presented in this paper, were recently used
to participate in the Supernova Photometric Classification
Challenge (Kessler et al. 2010b), a public competition
for classifying SN light curves. The authors of the
challenge released a large number of simulated SN light 
curves of undisclosed types and a small "spectroscopic" 
sample with known redshifts and types for training. Par- 
ticipants of the challenge submitted their classifications 
as well as photometric redshifts if available. The algo- 
rithm presented here achieved the highest overall figure 
of merit, though there is significant room for improve- 
ment. 

This paper focuses on understanding these issues using an improved implementation of
existing methods and through analysis of a much larger sample of SN candidates for
testing. We use the three-year SDSS-II SN Survey data as our test bed to identify
photometric SN Ia candidates with realistic estimates of sample purity. The
description of the photometric classification algorithm and the spectroscopic and
photometric SN samples from SDSS-II are presented in § 2. The procedures for
estimating the SN Ia typing efficiency and purity using the spectroscopic sample are
described in § 3 and § 4. The properties of the photometric SN Ia candidates
identified are described in § 5. The quality of the light curve photometric redshifts
is discussed in § 6. Comparisons with simulations are shown in § 7. Finally, our
results are summarized in § 8.

2. THE SDSS-II SN CANDIDATES 

The SDSS-II SN Survey was conducted during the September - November months of
2005 - 2007. A 300 deg$^2$ region along the celestial equator was observed using the
SDSS 2.5m telescope (Gunn et al. 1998; Fukugita et al. 1996; York et al. 2000; Gunn
et al. 2006) with an average cadence of four days (Frieman et al. 2008; Abazajian et
al. 2009). The survey depth and area are optimal for discovering and measuring light
curves of SN Ia at intermediate redshifts (0.1 < z < 0.4), complementing other
surveys. During the search campaigns, new variable and transient sources detected in
the difference images were designated as "SN candidates". After each night of imaging
observations on the SDSS telescope, the SN candidates were photometrically classified
based on the available multiband light curves, and a subset of the events were
observed spectroscopically close to their moment of discovery (Sako et al. 2008).
Photometry and results from follow-up spectroscopy from the first season are
presented in Holtzman et al. (2008) and Zheng et al. (2008), respectively, and
measurements of the cosmological parameters from the first-year sample and studies of
the sources of systematic uncertainties are presented in Kessler et al. (2009a),
Sollerman et al. (2009), and Lampeitl et al. (2009).

Over 10000 SN candidates were discovered during the three-year SDSS-II SN Survey, and
the majority of these candidates are spectroscopically unconfirmed due to limited
spectroscopic resources. The goal of this paper is to photometrically identify the
SN Ia candidates, and to estimate the efficiency and purity of that photometric
classification. We investigate whether reliable cosmological measurements can be
performed from SN candidates without spectroscopic confirmation. We first describe
the SN classification algorithm below, and then discuss our method for estimating the
efficiency and purity using a limited number of spectroscopically-confirmed SN.

TABLE 1
Core-collapse SN Templates

Type   Subtype   IAU Name    SDSS ID
Ibc    Ib        SN2005hl    2000
       Ib        SN2005hm    2744
       Ic        SN2006fo    13195
       Ib        SN2006jo    14492
II     II-L/P    SN2004hx    18
       II-P      SN2005lc    1472
       II-P      SN2005gi    3818
       II-P      SN2006jl    14599

2.1. Photometric SN Classification Algorithm 

The candidates are classified using a light curve analysis software called
"Photometric SN IDentification" (PSNID), which is an extended version of the software
used for prioritizing spectroscopic follow-up observations for the SDSS-II SN Survey
as described in Sako et al. (2008)^13. Extensive tests were performed using the
publicly-available SNANA light curve simulations^14 as well as the data presented
here. PSNID was also used to analyze simulations from the Supernova Photometric
Classification Challenge and achieved the highest overall figure of merit (Kessler et
al. 2010b, hereafter K10b). Briefly, the software uses the observed photometry,
calculates the reduced $\chi^2$ ($\chi^2_r = \chi^2$ per degree of freedom) against a
grid of SN Ia light curve models and core-collapse SN (CC SN) templates, and
identifies the best-matching SN type and set of parameters with, and without, host
galaxy redshift as priors in the grid search. A number of important improvements have
been made, which are described below.

^13 The software is included in the SNANA package (Kessler et al. 2009b). A
standalone version is also available directly from the author.

^14 http://sdssdp62.fnal.gov/sdsssn/SIMGEN_PUBLIC/

First, in addition to finding the light curve model with the minimum $\chi^2_r$
through a grid search, the software computes the Bayesian probabilities that a
candidate could be a Type Ia, Type Ib/c, or a Type II SN. The algorithm is similar to
that of Poznanski et al. (2007a) except that we subclassify CC SN into Types Ib/c and
II using an extended set of templates (see below), and also allow the SN Ia light
curve shape parameter and distance modulus to vary in the fits. Specifically, we
calculate the Bayesian evidence $E$ by marginalizing the product of the likelihood
function and prior probabilities over the model parameter space. For the SN Ia
models, there are five model parameters: redshift $z$, V-band host galaxy extinction
$A_V$, time of maximum light $T_{\rm max}$, $\Delta m_{15}(B)$ (Phillips 1993;
Phillips et al. 1999), and distance modulus $\mu$. Milky Way extinction is modeled
assuming the Cardelli, Clayton, & Mathis (1989) law with $R_V = 3.1$, while
extinction in the SN host galaxy assumes a total-to-selective extinction ratio of
$R_V = A_V/E(B-V) = 2.2$ (Kessler et al. 2009a). Priors in $A_V$, $T_{\rm max}$, and
$\mu$ can also be applied optionally, but we set them to be flat in the present work.
For the redshift, we evaluate each light curve twice using 1) a flat prior and 2) a
Gaussian prior if an external redshift estimate $z_{\rm ext}$ and uncertainty
$\sigma_z$ are available from either the host galaxy (photometric or spectroscopic
redshift) or the SN spectrum. The SN Ia Bayesian evidence is therefore,

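$$E_{\rm Ia} = \int_{\rm all\ parameters} P(z)\, e^{-\chi^2/2}\, dz\, dA_V\, dT_{\rm max}\, d\Delta m_{15}(B)\, d\mu, \tag{1}$$

where,

$$P(z) = \frac{1}{\sqrt{2\pi}\,\sigma_z}\, e^{-(z - z_{\rm ext})^2/2\sigma_z^2}. \tag{2}$$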

When an external redshift is not available, we assume the prior to be flat by setting
$P(z) = 1$. For the SN Ib/c and SN II models, the integral over $\Delta m_{15}(B)$ is
replaced with a summation over the individual templates used in the comparison,

$$E_{\rm Ibc,II} = \sum_{\rm templates} \int P(z)\, e^{-\chi^2/2}\, dz\, dA_V\, dT_{\rm max}\, d\mu. \tag{3}$$



The Bayesian probability of one of the three possible SN 
types is then given by, 



$$P_{\rm type} = \frac{E_{\rm type}}{E_{\rm Ia} + E_{\rm Ibc} + E_{\rm II}}. \tag{4}$$



The probabilities $P_{\rm type}$ and minimum $\chi^2_r$ values calculated using the
Gaussian spectroscopic redshift prior are denoted with a subscript $z$ (i.e.,
$P_{z,{\rm type}}$ and $\chi^2_{z,r}$). External photometric redshifts of the host
galaxies are not used in the fits in this work. The probabilities are normalized such
that,

$$P_{\rm Ia} + P_{\rm Ibc} + P_{\rm II} = 1, \tag{5}$$



which is equivalent to assuming that the SN candidate is 
a real SN and not another class of variable sources. This 
assumption is reasonable, since sources in Stripe 82 with 
a prior history of variability and other multi-year vari- 
ables are rejected from our analysis (Sako et al. 2008). 
This set of Bayesian probabilities is useful because it quantifies the relative
likelihood of SN types; the best-fit minimum $\chi^2_r$ alone is not a good indicator
of the most likely SN type. As advocated by Kuznetsova & Connolly (2007), we
therefore select SN Ia based on both the Bayesian probability $P_{\rm Ia}$ and the
goodness-of-fit $\chi^2_r$.
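As an illustration of Eqs. (1)-(5), the following is a minimal Python sketch, not the PSNID implementation. It assumes that each model grid point has already been reduced to a (z, chi2) pair, that all grid cells have unit volume, and that the Gaussian prior is normalized; the function names and the toy numbers are hypothetical.

```python
import math


def gaussian_redshift_prior(z, z_ext, sigma_z):
    """Gaussian prior P(z) centred on an external redshift (assumed normalized)."""
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma_z)
    return norm * math.exp(-((z - z_ext) ** 2) / (2.0 * sigma_z ** 2))


def evidence(grid, z_ext=None, sigma_z=None):
    """Approximate the evidence integral by a sum over grid points.

    `grid` is an iterable of (z, chi2) pairs, one per grid point, with all
    other parameters (or templates) already enumerated. Unit cell volume is
    an assumption of this sketch.
    """
    total = 0.0
    for z, chi2 in grid:
        # Flat prior P(z) = 1 when no external redshift is supplied.
        prior = 1.0 if z_ext is None else gaussian_redshift_prior(z, z_ext, sigma_z)
        total += prior * math.exp(-0.5 * chi2)
    return total


def type_probabilities(e_ia, e_ibc, e_ii):
    """Normalize the three evidences so that the probabilities sum to one."""
    norm = e_ia + e_ibc + e_ii
    return {"Ia": e_ia / norm, "Ibc": e_ibc / norm, "II": e_ii / norm}


# Toy chi-square grids (hypothetical values, for illustration only).
ia_grid = [(0.18, 12.0), (0.20, 10.0), (0.22, 13.0)]
ibc_grid = [(0.18, 25.0), (0.20, 22.0), (0.22, 24.0)]
ii_grid = [(0.18, 30.0), (0.20, 28.0), (0.22, 31.0)]

probs_flat = type_probabilities(evidence(ia_grid), evidence(ibc_grid), evidence(ii_grid))
probs_z = type_probabilities(
    evidence(ia_grid, z_ext=0.20, sigma_z=0.01),
    evidence(ibc_grid, z_ext=0.20, sigma_z=0.01),
    evidence(ii_grid, z_ext=0.20, sigma_z=0.01),
)
```

Because the three probabilities are forced to sum to one, a high $P_{\rm Ia}$ only says that the SN Ia model is preferred over the CC SN templates, which is why a separate goodness-of-fit requirement is still needed.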

Next, although the SN Ia light curve models used herein are the same as those
described in Sako et al. (2008), we have assigned empirical model errors that yield
reasonable $\chi^2_r$ values for light curves with high S/N ratio. The assumed
magnitude errors $\delta m$ on the gri model light curves depend on the rest-frame
epoch $t$ in days from B-band maximum as follows,

$$\delta m_{\rm Ia} = \begin{cases} 0.08 + 0.04 \times (|t|/20) & |t| < 20\ {\rm days}, \\ 0.12 + 0.08 \times ((|t| - 20)/60) & |t| \geq 20\ {\rm days}. \end{cases} \tag{6}$$



Fig. 1. - Absolute magnitude light curves of SN Ib/c discovered and observed by SDSS-II, which are part of the template library: SN
2005hl (top left), SN 2005hm (top right), SN 2006fo (bottom left), and SN 2006jo (bottom right). [See online version for color figures.]



The CC SN light curve templates have errors in gri given by,

$$\delta m_{\rm CC} = 0.08 + 0.08 \times (|t|/60) \tag{7}$$

for all epochs. The model errors in u and z are chosen to be twice the above values
due to larger intrinsic model variations and calibration uncertainties in these
bands. These $\delta m$ parameters were determined to provide reasonable $\chi^2_r$
values ($\chi^2_r \sim 1$) primarily for nearby SN candidates with small photometric
errors. They do not affect the fit results of faint candidates.
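The model errors of Eqs. (6) and (7) can be written as short functions. This is a sketch with hypothetical function names; how the model error is combined with the photometric error in the fit is not specified in this section and is not assumed here.

```python
def delta_m_ia(t):
    """SN Ia model error (mag) in g, r, i at rest-frame epoch t in days, Eq. (6)."""
    abs_t = abs(t)
    if abs_t < 20.0:
        return 0.08 + 0.04 * (abs_t / 20.0)
    return 0.12 + 0.08 * ((abs_t - 20.0) / 60.0)


def delta_m_cc(t):
    """CC SN template error (mag) in g, r, i at rest-frame epoch t in days, Eq. (7)."""
    return 0.08 + 0.08 * (abs(t) / 60.0)


def model_error(t, band, sn_type="Ia"):
    """Model error for any of the ugriz bands; u and z use twice the gri value."""
    base = delta_m_ia(t) if sn_type == "Ia" else delta_m_cc(t)
    return 2.0 * base if band in ("u", "z") else base
```

The two branches of the SN Ia error agree at |t| = 20 days (0.12 mag), so the error grows continuously away from maximum light.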

Third, we adopt CC SN light curve templates from a sample of nearby SN discovered and
observed by SDSS-II. Specifically, we use four SN Ib/c templates and four SN II
templates as listed in Table 1. The SDSS-II CC SN light curve templates were
generated using the Nugent, Kim, & Perlmutter (2002) spectral templates,
interpolating between epochs, and warping them to match each of the observed ugriz
light curves at their respective spectroscopic redshifts. For all SN Ib/c, we use
Nugent's normal Ibc spectral templates, and we use the Type II-P templates for all
SN II. The SN II light curve photometry is available from D'Andrea et al. (2010).

The set of eight core-collapse templates listed in Table 1 was selected from a larger
group of 24 templates (5 Nugent, 11 SDSS-II, and 8 from the SUSPECT^15 database) by
empirically maximizing the purity of the confirmed SN Ia sample. Core-collapse
templates that either frequently misidentify SN Ia as CC SN or correctly identify
only a small number of confirmed CC SN were excluded. Rare, peculiar SN Ia are also
excluded from our template library. We also do not include templates for other types
of variable sources, most notably the active galactic nuclei (AGN), since there are
other ways of rejecting the majority of these events. The rest-frame absolute
magnitude ugriz light curves of the eight CC SN used as templates in this analysis
are shown in Figures 1 and 2.

^15 http://bruford.nhn.ou.edu/~suspect/index1.html

Finally, while the Bayesian classification probabilities 
are computed through marginalization over the grid of 
the model parameters, the posterior probability distri- 
butions for each of the five parameters are estimated by 
running a Markov Chain Monte Carlo (MCMC). This re- 
sults in a significant reduction of computing time and 
more reliable estimates of the parameter uncertainties, 
since the probability distributions are often asymmetric, 
show significant correlations, and can often have more 
than one local maximum. It is also straightforward to 
incorporate additional model parameters and priors. 
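The text does not specify which MCMC algorithm is used. As a hedged illustration of the idea only, the sketch below runs a basic Metropolis sampler on a toy one-dimensional posterior and summarizes the chain by its median and central 68% interval. The sampler choice, function names, and all numbers are assumptions of this example.

```python
import math
import random


def metropolis(log_posterior, start, step, n_steps, seed=0):
    """Draw a chain from a 1-D posterior using Gaussian proposals."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current)
    chain = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        if proposal_lp >= current_lp or rng.random() < math.exp(proposal_lp - current_lp):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


def summarize(chain, burn_in):
    """Return the 16th percentile, median, and 84th percentile after burn-in."""
    samples = sorted(chain[burn_in:])
    n = len(samples)
    return samples[int(0.16 * n)], samples[n // 2], samples[int(0.84 * n)]


def toy_log_posterior(z):
    """Toy Gaussian posterior in redshift (hypothetical centre 0.20, width 0.02)."""
    return -0.5 * ((z - 0.20) / 0.02) ** 2


chain = metropolis(toy_log_posterior, start=0.25, step=0.02, n_steps=20000, seed=1)
low, median, high = summarize(chain, burn_in=2000)
```

Percentile-based summaries of this kind remain meaningful when the posterior is asymmetric, which is one of the motivations given above for sampling rather than relying on the grid minimum alone.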

Figure 3 shows an example output from PSNID for a spectroscopically-confirmed SN Ia,
2006jz at z = 0.20. Derived parameter constraints from the MCMC are shown for both
the flat and spectroscopic redshift priors. There are two general points that are
worth noting. First, $z$ and $A_V$ are anti-correlated in the sense that a low-$z$,
high-$A_V$ SN Ia is similar to a high-$z$, low-$A_V$ event. This is expected, since
redshift and dust both have the effect of reddening the light curves. But since dust
also attenuates the light, a larger $A_V$ value must be compensated for by putting
the event at a smaller distance modulus. This happens in such a way that $z$ and
$\mu$, marginalized over the other three parameters, are positively correlated. The
slope of this correlation is redshift-dependent. Second, the widths of the
marginalized $\mu$ and $A_V$ probability distribution functions (PDF) for the flat
redshift prior are only a factor ~ 2 larger than those for a spectroscopic redshift
prior. This general behavior is true for most of our well-observed SN Ia, although
the constraints using a flat-z prior degrade dramatically at higher redshifts, as
shown in Figure 4 for a z = 0.30 confirmed SN Ia 2005it.

Fig. 2. - Absolute magnitude light curves of SN II discovered and observed by SDSS-II, which are part of the template library: SN
2004hx (top left), SN 2005lc (top right), SN 2005gi (bottom left), and SN 2006jl (bottom right). [See online version for color figures.]

2.2. Confirmed and Unconfirmed Samples 

We first divide the full sample of SN candidates into 
two groups - the spectroscopically confirmed and un- 
confirmed samples. The unconfirmed sample consists 
of sources of unknown type with no spectroscopy of 
the active SN candidate, but a subset of the events 
do have spectroscopy of their host galaxies. The 
spectroscopically-confirmed sample consists of SN la, 
SN Ib/c, SN II, as well as variable AGN. This sample 
is used to study the classification criteria and also al- 
lows us to estimate the selection efficiency and purity, 
which is a crucial part of our analysis. The ugriz multi- 
band light curves of all SN candidates are constructed 
using the Scene-Modeling Photometry method (SMP;
Holtzman et al. 2008) and analyzed using the PSNID
software described above.

The full SN sample is analyzed with PSNID, and we select the candidates that have
light curve coverage and signal-to-noise (S/N) ratio that are appropriate for
photometric SN Ia classification. Specifically, we consider only the candidates that
meet the following three criteria: (1) Have at least one epoch of photometry near
peak at $-5 < t < +5$ days in the SN rest frame and at least one additional epoch
after peak at $t > +15$ days, which are determined from the best-fit SN Ia model,
irrespective of whether or not the fit is acceptable; (2) Have maximum S/N ratio
greater than five in at least two of the gri bands; and (3) Were detected during only
one search season. These cuts are referred to as the light curve quality cuts.
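The three light curve quality cuts can be expressed as a single predicate. The sketch below follows the criteria as stated in the text above; the function name and the input layout (a list of rest-frame epochs and a per-band dictionary of maximum S/N) are assumptions of this example.

```python
def passes_quality_cuts(rest_epochs, max_snr, n_seasons):
    """Apply the light curve quality cuts to one SN candidate.

    rest_epochs: rest-frame epochs in days relative to peak of the best-fit SN Ia model
    max_snr:     dict mapping band name to the maximum S/N in that band
    n_seasons:   number of search seasons in which the candidate was detected
    """
    # (1) One epoch near peak and one additional epoch well after peak.
    has_peak_epoch = any(-5.0 < t < 5.0 for t in rest_epochs)
    has_late_epoch = any(t > 15.0 for t in rest_epochs)
    # (2) Maximum S/N above five in at least two of the g, r, i bands.
    n_good_bands = sum(1 for band in ("g", "r", "i") if max_snr.get(band, 0.0) > 5.0)
    # (3) Detected in only one search season.
    return has_peak_epoch and has_late_epoch and n_good_bands >= 2 and n_seasons == 1
```

Because the epochs are measured relative to the best-fit SN Ia model, the same candidate can pass with one redshift prior and fail with the other when the two fits prefer different dates of maximum light.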

The spectroscopically-confirmed sample consists of 508 SN Ia, 80 CC SN (18 SN Ib/c,
62 SN II), and 202 AGN^16. We refer to these as the "conf-Ia", "conf-CC", and the
"conf-AGN" samples. After imposing the light curve quality cuts, this sample is
reduced to 367 SN Ia, 45 CC SN, and 83 AGN, for a total of 495 events when a flat
redshift prior is used. Using the spectroscopic redshift prior results in 551 events.
The numbers differ since the two forms of the redshift priors can result in best-fit
SN Ia models with dramatically different dates of maximum light, especially for the
AGN.

There is a significant bias in the spectroscopically- 
confirmed SN sample toward brighter events. For the 
SDSS-II SN Survey, our primary goal was to discover and 
study the properties of SN la, so only a small fraction 
of CC SN candidates were observed for spectroscopy. A 
detailed study of the impact on photometric SN la typ- 
ing due to contaminating sources is, therefore, limited by 
this small number of spectroscopically confirmed CC SN. 

To help quantify this bias, we identified the SN candidates that are associated with
galaxies with spectra from the SDSS spectroscopic survey (Eisenstein et al. 2001;
Strauss et al. 2002; Richards et al. 2002). These galaxies have well-defined
selection criteria and, as we describe below, will help quantify the spectroscopic
targeting bias and obtain a better estimate of the level of contamination from
non-SN Ia events. There are a total of 2369 SN candidates that are within 10" of an
SDSS spectroscopic galaxy. This sample is referred to as the "zSDSS" sample. After
light curve quality cuts, there are 448 and 499 sources for the flat and
spectroscopic redshift priors, respectively, which include both confirmed and
unconfirmed SN candidates. The majority of the sources are rejected because of their
multi-year variability, suggesting that these sources are likely variable AGN whose
nuclear activity is not immediately apparent from their optical spectra. The samples
are summarized in Table 2. The redshift distributions of the four different
spectroscopic samples are shown in Figure 5.

^16 Of the 202 AGN, 58 are in the DR7 spectroscopic quasar catalog from Schneider et
al. (2010).

Fig. 3. - An example of the posterior probability distribution functions (PDF) for a spectroscopically-confirmed SN Ia 2006jz at z = 0.20.
The observed ugri light curves and the best-fit SN Ia model are shown on the bottom right panel. The top-left and middle-left panels
show 1- and 2-$\sigma$ contours in the $z$-$\mu$ and $z$-$A_V$ planes, respectively, assuming a flat redshift prior. The X indicates the median parameter
values when a spectroscopic-redshift prior is used. The two panels on the right and the bottom-left panel show the posterior PDF in $\mu$,
$A_V$, and $z$ marginalized over the other 4 parameters using the flat (black) and spectroscopic (gray) redshift priors.

The unconfirmed sample consists of a total of 3221 candidates that pass the same
light curve quality cuts. Of these 3221 candidates, 2776 have no spectroscopic
observations, while the remaining 445 candidates are either part of the zSDSS sample
described above (230 candidates) or have host galaxy redshifts from our own follow-up
observations (215 candidates).

A histogram of the maximum r-band S/N of this sample is shown in Figure 6. The mean
S/N of ~ 30 for the spectroscopic sample is substantially higher than that of the
photometric sample, which has a mean S/N of ~ 10. The implications of this difference
are discussed in § 5.

3. SN CLASSIFICATION FIGURE OF MERIT 

Since our goal here is to identify SN Ia, we define the photometric typing efficiency
$\epsilon_{\rm Ia}$ as the fraction of SN Ia, after software S/N and light curve
quality cuts, that are photometrically identified as SN Ia. Letting
$N^{\rm true}_{\rm Ia}$ be the number of true SN Ia photometrically identified as
SN Ia and $N^{\rm CUT}_{\rm Ia}$ be the total number of SN Ia in the sample after the
light curve quality cuts, we define the photometric SN Ia selection efficiency to be,

$$\epsilon_{\rm Ia} = \frac{N^{\rm true}_{\rm Ia}}{N^{\rm CUT}_{\rm Ia}}. \tag{8}$$

Fig. 4. - Same as in Figure 3 for a spectroscopically-confirmed SN Ia 2005it at z = 0.30.



Note that this is not the true SN Ia identification efficiency since the denominator
$N^{\rm CUT}_{\rm Ia}$ includes only the events that pass the S/N and light curve
quality cuts. In terms of the total number of SN Ia ($N^{\rm TOT}_{\rm Ia}$) that
were detected in the area observed by the survey,

$$N^{\rm CUT}_{\rm Ia} = \epsilon_{\rm CUT}\, N^{\rm TOT}_{\rm Ia}, \tag{9}$$

where $\epsilon_{\rm CUT}$ is, in general, a function of $z$, $A_V$,
$\Delta m_{15}(B)$, peak magnitude, time of maximum light, software detection
threshold, requirements on light curve S/N and temporal coverage, as well as the
observing conditions. The determination of the value of $\epsilon_{\rm CUT}$ is
beyond the scope of the paper, but the effect of our selection cuts can be modeled
using the SNANA package.



Adopting a convention similar to that used in evaluating the SN Photometric
Classification Challenge (hereafter SNPhotCC; K10b), we define the photometric purity
$\eta_{\rm Ia}$ as the fraction of the candidates identified as SN Ia that are actual
SN Ia, with a penalty factor $W^{\rm false}_{\rm Ia}$ described below. Letting
$N^{\rm false}_{\rm Ia}$ be the number of non-SN Ia incorrectly identified as SN Ia,
the photometric purity of the sample is,

$$\eta_{\rm Ia} = \frac{N^{\rm true}_{\rm Ia}}{N^{\rm true}_{\rm Ia} + \sum_i W^{\rm false}_{i} N^{\rm false}_{i}}, \tag{10}$$

where the sum in the denominator allows for several classes $i$ of contaminating
sources (e.g., CC SN, AGN, and variable stars) possibly with different penalty
factors. We define a figure of merit ($C_{\rm FoM-Ia}$) as,

$$C_{\rm FoM-Ia} = \epsilon_{\rm Ia} \times \eta_{\rm Ia}. \tag{11}$$

This definition of CpoM-ia is designed for real data 
and differs from the pseudo-purity from the SNPhotCC 
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TABLE 2 
The SDSS-II Spectroscopic Sample 



Type 


Total 


Flat Redshift Prior 
Good'' Pl^ > 0.9 Pla 


<0.1 


Spectroscopic Redshift Prior 
Good^ Pl^ > 0.9 i^a < 0.1 


Confirmed SN la 
Confirmed CC SN 
Confirmed AGN 


508 

80 

202 


367 
45 
83 




357 
14 
32 




2 
30 

44 


371 

45 

135 




366 
11 
86 


1 
32 

46 


SJN with zsDSS 


2369 


448 




248 




159 


499 




317 


150 


Total 


3159 


732 




539 




201 


788 




599 


163 








1 1 1 1 1 1 1 


8 


: 


spec-CC J 


o 


7 


L] : 




T 


, 1 ■■! 1 , 1 , ■ 



0.2 0.4 0.6 0.8 




'^ This sample includes SN that satisfy the following photometric quality criteria: (1) There 
is at least one epoch of photometry at —5 < t < +5 days from peak and another epoch at 
+5 < t < +15 days from peak for the best-fit SN la model; (2) There is at least two filter 
measurements with S/N > 5; (3) The candidate was detected in only a single search season. 

mal measure for all types of studies. Higher SN la purity 
might be more important than efficiency for certain stud- 
ies, and vice versa. Finally, we define the contamination 
Kja as, 

Kla = 1 - rjia- (12) 

These quantities determined with the spectroscopic red- 
shift prior are designated with a subscript z. 

To give a simple numerical example, consider a survey 
that is capable of detecting 100 SN la that pass S/N and 
light curve quality cuts. A photometric classifier that 
identifies 90 candidates as SN la, of which 10 are actually 
non-la events has an efficiency of eja = 80/100 = 0.80, 
purity of ?7ia ~ 80/90 — 0.89, and contamination of 
Kia = 1 — 0.89 = 0.11. In practice, however, these 
quantities can be determined only for the spectroscop- 
ically confirmed SN sample for which the correct type is 
known. The efficiency, purity, or some combination of 
these two parameters can be optimized by choosing the 
appropriate values for Pja and Xr- If the spectroscopic 
sample is an unbiased representation of all of the SN can- 
didates, then one can expect the efficiency and the purity 
of both the spectroscopic and photometric samples to be 
the same within statistical uncertainties. However, this is 
almost never the case in a blind SN survey given limited 
spectroscopic resources. SN candidates that are brighter 
and/or suffer less host galaxy contamination will have 
higher spectroscopic success and completeness. This is 
illustrated in Figure |6l which shows that the light curve 
peak S/N of the spectroscopic sample is on average a fac- 
tor of '-^ 3 higher than that of the photometric sample. 
Below we describe a method to correct for this bias and 
to estimate the efficiency and purity of the photometric 
sample using a limited and biased spectroscopic training 
set. 



Fig. 5. - The redshift distributions of the conf-Ia (top left),
conf-CC (top right), conf-AGN (bottom left), and $z_{\rm SDSS}$ (bottom right)
samples used in our studies (see § 2.2 for descriptions). The solid
and dashed histograms represent the samples that pass our light
curve quality cuts with the flat and spectroscopic redshift priors,
respectively. The redshift bins are $\Delta z = 0.05$ wide.

Fig. 6. - The distributions of maximum r-band signal-to-noise
ratio (S/N) of the spectroscopically-confirmed SN candidates
(dashed) and photometric candidates (solid) considered in this
work. The spectroscopic sample has an average peak S/N of ~ 30
while the photometric sample has an average S/N of ~ 10.

by the unknown factor $1/\epsilon_{\rm CUT}$, i.e.,
$C_{\rm FoM-Ia} = C^{\rm SNPhotCC}_{\rm FoM-Ia}/\epsilon_{\rm CUT}$. K10b also define the true purity to
be the case for $W^{\rm false}_{\rm Ia} = 1$. This figure of merit is only
one measure of success, and it is not necessarily the opti-



4. ESTIMATING THE EFFICIENCY AND PURITY

4.1. SN Ia Identification with Spectroscopic Redshifts

We first estimate the efficiency and purity of photometric
SN Ia identification when spectroscopic redshifts are
used as priors in the light curve fits. We determine
$N^{\rm true}_{z,{\rm Ia}}$ and $N^{\rm false}_{z,{\rm Ia}}$ from the spectroscopic SN Ia and CC SN, and
how they depend on the minimum $P_{z,{\rm Ia}}$ and the maximum
allowed $\chi^2_{z,r}$. This is relevant for future SN surveys
that will, for example, obtain spectra of all SN candidate
host galaxies after the search, but not spectra of all the
active SN candidates. The values for $P_{z,{\rm Ia}}$ and $\chi^2_{z,r}$ are
shown in Figure 7 separately for the spectroscopically
confirmed SN Ia and CC SN samples.

spectroscopically-confirmed SN Ia (top panel) and CC SN.
Spectroscopic redshifts are used as priors in all of the fits.

Fig. 8. - Histograms of best-fit $\chi^2_{z,r}$ values for a SN Ia model
for $P_{z,{\rm Ia}} > 0.9$ (black) and $P_{z,{\rm Ia}} < 0.1$ (gray) for the spectroscopically
confirmed SN Ia (top panel) and CC SN (bottom panel).

As shown in the top panel of Figure 7, all but a handful
of SN Ia are well fit to a SN Ia model. Of the
$N^{\rm TOT}_{z,{\rm Ia}} = 371$ spectroscopic SN Ia that pass the light
curve quality cuts, 366 sources have $P_{z,{\rm Ia}} > 0.9$. Only
a single SN Ia (SN 2007qd; McClelland et al. 2010) has
$P_{z,{\rm Ia}} < 0.1$. This event is a nearby peculiar 2002cx-like
event, which is underluminous compared to normal SN Ia
and has an extremely low expansion velocity (Li et al.
2003; Jha et al. 2006b). There are other nearby peculiar

2005gj (Aldering et al. 2006; Prieto et al. 2007), but these
candidates were detected over two search seasons due to
their brightness and slow decline, and were, therefore,
rejected. The bottom panel of the same figure, however,
shows that a substantial fraction of the spectroscopic
CC SN also satisfy $P_{z,{\rm Ia}} > 0.9$, implying that the
contamination can be significant depending on the maximum
allowed $\chi^2_{z,r}$ value used for the SN Ia identification.
Specifically, 11 out of the 45 CC SN (24%) that satisfy
our light curve quality cuts have $P_{z,{\rm Ia}} > 0.9$. If no other
cuts are invoked, then $N^{\rm true}_{z,{\rm Ia}} = 366$ and $N^{\rm false}_{z,{\rm Ia}} = 11$. We
also note that the majority of the sources have either
$P_{z,{\rm Ia}} \sim 0$ or $P_{z,{\rm Ia}} \sim 1$, so both $N^{\rm true}_{z,{\rm Ia}}$ and $N^{\rm false}_{z,{\rm Ia}}$ are not
sensitive to the precise choice of the minimum $P_{z,{\rm Ia}}$.

Before determining how $N^{\rm true}_{z,{\rm Ia}}$ and $N^{\rm false}_{z,{\rm Ia}}$ depend on
the choice of the maximum $\chi^2_{z,r}$, we note that 5 of the
11 CC SN with $P_{z,{\rm Ia}} > 0.9$ can be rejected by requiring
the light curve photo-z ($z_{\rm lc}$), using a flat redshift prior,
to be within $3\sigma$ of the spectroscopic redshift $z_{\rm spec}$; i.e.,
$|z_{\rm lc} - z_{\rm spec}|/\sigma_z < 3$. We reject candidates that fail this
cut, and show the distributions of the $\chi^2_{z,r}$ values for the
SN Ia and CC SN in Figure 8 for $P_{z,{\rm Ia}} > 0.9$ and $P_{z,{\rm Ia}} < 0.1$.
Of the 366 SN Ia and 11 CC SN with good light
curves and $P_{z,{\rm Ia}} > 0.9$, 22 and 5 candidates, respectively,
are rejected by this requirement on redshift agreement.
Therefore, there are only 6 CC SN that satisfy all SN Ia
selection cuts.
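
The three selection requirements described above (a minimum Bayesian probability, a maximum reduced chi-square, and agreement between the light curve photo-z and the spectroscopic redshift) can be written as a single predicate. The sketch below is illustrative only: the function and argument names are hypothetical, and the default chi-square threshold of 1.8 is the value adopted later in the text for the host-redshift sample.

```python
def passes_zprior_ia_cuts(p_ia, chi2_r, z_lc, z_spec, sigma_z,
                          p_min=0.9, chi2_max=1.8, n_sigma=3.0):
    """Return True if a candidate satisfies all SN Ia selection cuts.

    p_ia    : Bayesian probability of being a SN Ia (spectroscopic-z prior)
    chi2_r  : reduced chi-square of the best-fit SN Ia model
    z_lc    : light curve photo-z obtained with a flat redshift prior
    z_spec  : spectroscopic redshift
    sigma_z : uncertainty on z_lc
    """
    # Redshift-agreement cut: |z_lc - z_spec| / sigma_z < 3
    photoz_consistent = abs(z_lc - z_spec) / sigma_z < n_sigma
    return p_ia > p_min and chi2_r < chi2_max and photoz_consistent


# Hypothetical candidates (values invented for illustration only).
print(passes_zprior_ia_cuts(0.99, 1.1, z_lc=0.21, z_spec=0.20, sigma_z=0.02))  # True
print(passes_zprior_ia_cuts(0.99, 1.1, z_lc=0.35, z_spec=0.20, sigma_z=0.02))  # False
```

Applying the redshift-agreement cut after the probability cut mirrors the order in the text, where it removes 5 of the 11 high-probability CC SN.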

In the last step, we estimate the unknown factor $W^{\rm false}_{z,{\rm Ia}}$,
which can be interpreted as a penalty factor for spectroscopic
incompleteness and targeting biases. The SDSS-II
SN Survey follow-up strategy was to observe the "good"
SN Ia candidates at higher priority than the CC SN
candidates, especially for the fainter (r > 20.5 mag) sources
due to limited spectroscopic resources. A simple
interpretation of this factor is that if our follow-up strategy
had instead been to observe a random sample of SN
candidates, then we would have spectroscopically identified
$W^{\rm false}_{z,{\rm Ia}}$ times more CC SN.

One way to estimate this bias factor is to select a
subsample of SN candidates with spectroscopic redshifts,
which is representative of the underlying distribution
of the SN types. The ratio of these candidates with
$P_{z,{\rm Ia}} < 0.1$ to those with $P_{z,{\rm Ia}} > 0.9$ can then be
interpreted to be approximately the ratio of CC SN to SN Ia
in our survey.

This can be done by considering the SN candidates
in galaxies with redshifts from the SDSS spectroscopic
survey, which has a set of well-defined selection criteria.
We identify candidates in the main galaxy (Strauss et al.
2002), quasar (Richards et al. 2002), and the Luminous
Red Galaxy (LRG; Eisenstein et al. 2001) samples. The
LRG sample is several magnitudes deeper than the main
galaxy sample and consists primarily of passive galaxies
with old stellar populations, which do not host any
CC SN. We include this sample to account for the fact
that SN Ia are also on average a few magnitudes more
luminous than CC SN, so a magnitude-limited survey will
discover many more SN Ia than CC SN. The distributions
of $\chi^2_{z,r}$ for $P_{z,{\rm Ia}} > 0.9$ and $P_{z,{\rm Ia}} < 0.1$ for candidates
in the SDSS galaxy spectroscopy sample with
$|z_{\rm lc} - z_{\rm spec}|/\sigma_z < 3$ are shown in Figure 9. The ratio of
the number of candidates with $P_{z,{\rm Ia}} > 0.9$ to those with
$P_{z,{\rm Ia}} < 0.1$ is 197/56 = 3.5 compared to 350/11 = 32
for the combined spectroscopic sample shown in the
bottom panel of Figure 8. The bias (penalty) factor for the
spectroscopic sample can, therefore, be estimated to be
$W^{\rm false}_{z,{\rm Ia}} = 32/3.5 = 9.0$. An unbiased spectroscopic
follow-up strategy would have resulted in $W^{\rm false}_{z,{\rm Ia}} = 9.0$ times
more contaminating CC SN for SN Ia identification.

Fig. 9. - Histograms of best-fit $\chi^2_{z,r}$ values for a SN Ia model for
$P_{z,{\rm Ia}} > 0.9$ (black) and $P_{z,{\rm Ia}} < 0.1$ (gray) for the SN candidates in
SDSS spectroscopic galaxies using the redshift as a prior.

Fig. 10. - The efficiency, purity, and figure of merit for the
spectroscopically-confirmed SN as functions of the
maximum-allowed $\chi^2_{z,r}$ for $P_{z,{\rm Ia}} > 0.9$.
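
The penalty factor is a ratio of two ratios, which is easy to get upside down. The short sketch below simply redoes the arithmetic quoted in the text; the function and argument names are hypothetical.

```python
def penalty_factor(n_high_p_spec, n_low_p_spec, n_high_p_ref, n_low_p_ref):
    """Bias (penalty) factor for the spectroscopically confirmed sample.

    n_high_p_spec / n_low_p_spec : ratio in the targeted spectroscopic sample
    n_high_p_ref  / n_low_p_ref  : ratio in the reference sample taken to be
                                   representative (SDSS galaxy spectroscopy)
    """
    ratio_spec = n_high_p_spec / n_low_p_spec
    ratio_ref = n_high_p_ref / n_low_p_ref
    return ratio_spec / ratio_ref


# Counts quoted in the text for the spectroscopic redshift prior.
w_false = penalty_factor(350, 11, 197, 56)
print(f"W_false = {w_false:.1f}")  # W_false = 9.0
```

A value above unity means the follow-up preferentially targeted likely SN Ia, so each confirmed contaminant has to stand in for several unobserved ones.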

We use this penalty factor to calculate $\epsilon_{z,{\rm Ia}}$ and $\eta_{z,{\rm Ia}}$
as functions of the maximum $\chi^2_{z,r}$. The expression for
the purity is,

Figure 10 shows how $\epsilon_{z,{\rm Ia}}$, $\eta_{z,{\rm Ia}}$, and $C_{z,{\rm FoM-Ia}}$ depend
on the maximum-allowed $\chi^2_{z,r}$ for $P_{z,{\rm Ia}} > 0.9$. The figure
of merit has a broad maximum value of $C_{z,{\rm FoM-Ia}} \approx 0.84$
at approximately $\chi^2_{z,r} = 1.8$, where the efficiency and
purity are ~ 89% and ~ 94%, respectively. A caveat
to the estimate of $\eta_{z,{\rm Ia}}$ is that it is based on only six
confirmed CC SN that pass our SN Ia selection cuts.

Fig. 11. - The distributions of $P_{\rm Ia}$ and $\chi^2_r$ values for the
spectroscopically confirmed SN Ia (top panel), CC SN (middle panel),
and AGN (bottom panel). The fits were performed with a flat
redshift prior.

4.2. SN Ia Identification without Spectroscopic Redshifts

We next determine $N^{\rm true}_{\rm Ia}$ and $N^{\rm false}_{\rm Ia}$ when no external
redshift information is available to provide additional
constraints in the light curve fits. Here we have an additional
source of contamination, variable AGN,
which can be identified if either the galaxy spectrum is
available or the candidate is variable over a long period
of time (> 1 year). We use the confirmed SN and the
AGN samples discussed in § 2.2 to determine how the
efficiency, purity, and figure of merit depend on the minimum
$P_{\rm Ia}$ and the maximum allowed $\chi^2_r$ using the flat
redshift prior. The three panels in Figure 11 show the
$P_{\rm Ia}$ and $\chi^2_r$ values for the spectroscopic SN Ia, CC SN,
and AGN samples. As with the previous case, most of
the spectroscopic SN Ia are clustered near $P_{\rm Ia} \sim 1$ and
$\chi^2_r \sim 1$, indicating that they are well fit to SN Ia models.
There are also a handful of CC SN and AGN with $P_{\rm Ia} \sim 1$,
however, so the amount of contamination can again be
substantial depending on the maximum allowed $\chi^2_r$.

We also show in Figure 12 histograms of the $\chi^2_r$ values
for the same sources for $P_{\rm Ia} > 0.9$. Of the $N^{\rm TOT}_{\rm Ia} = 367$
spectroscopic SN Ia that pass our light curve quality cuts,
357 sources have $P_{\rm Ia} > 0.9$. There are also 14 CC SN and
32 AGN with $P_{\rm Ia} > 0.9$.

Fig. 12. - Histograms of best-fit $\chi^2_r$ values for a SN Ia model
for $P_{\rm Ia} > 0.9$ (black) and $P_{\rm Ia} < 0.1$ (gray) for the spectroscopically
confirmed (from top to bottom) 1) SN Ia, 2) CC SN, 3) AGN,
and 4) all three samples combined. Note that the vast majority
of SN Ia have $P_{\rm Ia} > 0.9$. The contaminating false-positives are
the CC SN and AGN represented by the black histograms with
$P_{\rm Ia} > 0.9$, and there are only a small number of those sources in
our sample.

Fig. 13. - Histograms of best-fit $\chi^2_r$ values for a SN Ia model for

Fig. 14. - The efficiency, purity, and figure of merit for the

For estimating $\eta_{\rm Ia}$, we apply the penalty factor only
on the CC SN sample where the bias is more significant.
Almost all of the spectroscopic AGN confirmation came
from SDSS quasar spectroscopy (Richards et al. 2002)
and not from our own targeting, so we assume that this
sample is unbiased. The expression for the efficiency is
given in EqlS] We write the purity explicitly as, 
where we have assumed $W^{\rm false}_{\rm Ia,AGN} = 1$. The penalty
factor $W^{\rm false}_{\rm Ia,CC}$ can be estimated from the histograms shown
in the bottom panel of Figure 12 and Figure 13. Specifically,
we have $W^{\rm false}_{\rm Ia,CC} = (403/76)/(259/199) = 4.1$ using
the same method as for the case with the spectroscopic
redshift prior. We show in Figure 14 the efficiency and
purity as a function of the maximum-allowed $\chi^2_r$ value.
Also shown is the figure of merit, which exhibits a broad
maximum at $C_{\rm FoM-Ia} = 0.86$. At $\chi^2_r \approx 1.6$, the efficiency
and purity are ~ 92% and ~ 94%, respectively.

Fig. 15. - $P_{z,{\rm Ia}}$ vs $\chi^2_{z,r}$ for the 445 photometric candidates with
galaxy spectroscopic redshifts. Candidates that do not meet the
light curve photo-z cut ($|z_{\rm lc} - z_{\rm spec}|/\sigma_z < 3$) are shown as crosses.
The 210 $z_{\rm host}$-Ia candidates identified are bounded by the red box
shown in the lower right.
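
The role of the penalty factors in the purity can be sketched in code. Two points here are assumptions rather than statements from the text: the purity is taken to have the penalty-weighted form $N^{\rm true}/(N^{\rm true} + \sum_k W_k N^{\rm false}_k)$, which is the form implied by the K10b convention that the true purity corresponds to $W = 1$, and the figure of merit is taken to be the product of efficiency and purity. That product does reproduce the quoted maxima. All function names and the example counts are hypothetical.

```python
def weighted_purity(n_true, false_counts, penalties):
    """Penalty-weighted purity (assumed form, see lead-in).

    n_true       : correctly identified SN Ia
    false_counts : dict mapping contaminant class -> number passing the cuts
    penalties    : dict mapping contaminant class -> penalty factor W;
                   classes not listed are treated as unbiased (W = 1)
    """
    weighted_false = sum(penalties.get(name, 1.0) * count
                         for name, count in false_counts.items())
    return n_true / (n_true + weighted_false)


def figure_of_merit(efficiency, purity):
    """Figure of merit, assumed here to be efficiency times purity."""
    return efficiency * purity


# Invented counts: 90 true SN Ia, 2 CC SN weighted by W = 4, 2 unweighted AGN.
print(weighted_purity(90, {"CC": 2, "AGN": 2}, {"CC": 4.0}))  # 0.9

# Consistency check against the efficiencies and purities quoted in the text.
print(round(figure_of_merit(0.89, 0.94), 2))  # 0.84 (spectroscopic-z prior)
print(round(figure_of_merit(0.92, 0.94), 2))  # 0.86 (flat prior)
```

Leaving the AGN weight at unity corresponds to the assumption in the text that the AGN sample, drawn from SDSS quasar spectroscopy, is unbiased.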

5. SDSS-II PHOTOMETRIC SN Ia CANDIDATES

We now evaluate the light curves of the 445 candidates 
with spectroscopic redshift measurements of their host 
galaxies. Their SN types are unknown because there were 



TABLE 3
SDSS-II $z_{\rm host}$-Ia Candidates^a

SDSS ID^b | RA^c | Dec^c | $z_{\rm spec}$ | $A_V$ | $\Delta m_{15}(B)$ | $P_{z,{\rm Ia}}$ | $\chi^2_{z,r}$
703 | -23.782080 | +0.650725 | 0.3000 ± 0.0100 | 04+0.16 '^•'-'^-0.18 | 70+0.13 | 1.000 | 0.99
779 | 26.673738 | -1.020580 | 0.2377 ± 0.0005 | ^.2ltl-Xl | ot; + 0.10 | 1.000 | 0.80
841 | 48.495991 | -1.010015 | 0.2991 ± 0.0005 | -o.i7t«:i« | 1.02+0-20 | 1.000 | 0.99
1415 | 6.106480 | +0.599307 | 0.2119 ± 0.0002 | omtlf. | 0.76lH^ | 1.000 | 0.93
1461 | 24.372675 | +0.209735 | 0.3407 ± 0.0005 | o.33J:«:li | imtr^ | 1.000 | 1.07
1595 | -38.432114 | -0.554060 | 0.2136 ± 0.0005 | 07+0.09 | l-03l[J:°« | 1.000 | 1.56
1748 | -6.887835 | -0.482495 | 0.3397 ± 0.0001 | C2+0.20 | 0.8310:^0 | 0.996 | 1.00
1775 | -41.006622 | -1.009430 | 0.3050 ± 0.0100 | -n 97+0.17 | 1 26+0-15 jO.15 | 1.000 | 1.07
1835 | -47.335869 | +1.071860 | 0.2716 ± 0.0100 | iq+0.19 | 1 90+0.22 l-^o_0.20 | 1.000 | 1.31

^a Full table is published in its entirety in the electronic edition of The Astrophysical Journal. A
portion is shown here for guidance regarding its form and content.
^b Internal SN candidate designation.
^c Coordinates are J2000. Right ascension is given in decimal degrees defined in the range [-180°, +180°].







TABLE 4
SDSS-II Photo-Ia Candidates^a

SDSS ID^b | RA^c | Dec^c | $z_{\rm lc}$ | $A_V$ | $\Delta m_{15}(B)$ | $P_{\rm Ia}$ | $\chi^2_r$
822 | 40.560776 | -0.862157 | 0.16710-065 | 0.51+131? | 1 94+0.14 | 1.000 | 1.38
859 | -9.448275 | +0.386555 | o.305t°:[;i | o.o4t;;-l? | 77+0.13 0-''-0.09 | 1.000 | 1.25
904 | 21.095400 | -0.124883 | n 900+0.029 | ^■^'^-0.34 | 1 in+0.22 1.1U_Q 17 | 0.999 | 1.00
1158 | 17.275431 | -0.352185 | cn/1 +0.006 | -0.58to.50 | 1 c:c: + 0.19 i.OO_Q 29 | 1.000 | 1.01
1243 | -18.340113 | -0.764753 | n 1SS+0.100 | 0.89to.73 | cq+0.09 u.oy_o 06 | 1.000 | 1.47
1285 | -38.216843 | +0.543195 | 04^+0.049 | 0.2ltg-g | , Ofi+0.25 | 1.000 | 1.01
1302 | 53.654808 | +0.891903 | 0.28210.039 | -OAOtlil | n 09+0. 08 | 1.000 | 1.23
1342 | -13.472480 | +0.117010 | 0.299to.046 | 0.03t°-.l? | 1 10+0.15 l-18_o.i4 | 1.000 | 0.90
1354 | -5.197145 | +0.089970 |

^a Full table is published in its entirety in the electronic edition of The Astrophysical Journal. A
portion is shown here for guidance regarding its form and content.
^b Internal SN candidate designation.
^c Coordinates are J2000. Right ascension is given in decimal degrees defined in the range [-180°, +180°].
Fig. 16. - $P_{\rm Ia}$ vs $\chi^2_r$ for the 2776 photometric candidates with no
spectroscopic information. The 860 photometric SN Ia candidates
are bounded by the gray box shown in the lower right.

Fig. 17. - Redshift distributions of the three SN Ia samples.

no spectroscopic observations of these objects. Selection
with $P_{z,{\rm Ia}} > 0.90$, $\chi^2_{z,r} < 1.8$, and $|z_{\rm lc} - z_{\rm spec}|/\sigma_z < 3$
results in 210 candidates shown in Figure 15. Based on
the analysis presented in § 4.1, we expect this sample
to have an efficiency of ~ 89%, purity of ~ 94%, and a
figure-of-merit of ~ 0.84. We refer to this sample of 210
candidates as the "$z_{\rm host}$-Ia sample". Their candidate ID,
coordinates, spectroscopic redshifts, and light curve fit
results are listed in Table 3.

Fig. 18. - Comparisons of $z_{\rm spec}$ and $z_{\rm lc}$ with a flat redshift prior
for the spectroscopic SN Ia sample. The 387 SN Ia that pass the
light curve quality cuts are shown in black while the 210 $z_{\rm host}$-Ia
are indicated by gray crosses in the top panel. The bottom panel
shows the mean $\Delta z = (z_{\rm lc} - z_{\rm spec})/(1 + z_{\rm spec})$ values in black circles
and the RMS as horizontal bars in bins of 0.05 for the combined
SN Ia + $z_{\rm host}$-Ia samples. The magnitude of the bias $|\Delta z|$ is less
than 0.02 for z < 0.4.

From the 2776 candidates with no spectroscopy,
identifying sources with $P_{\rm Ia} > 0.90$ and $\chi^2_r < 1.6$ results in
860 purely-photometric SN Ia candidates, which we refer
to as the "photo-Ia sample". The selection is shown in
Figure 16. We expect this sample to have an efficiency
of ~ 92%, a purity of ~ 94%, and a figure-of-merit of
0.86. Its redshift distribution is shown in Figure 17. The
mean redshift of the photo-Ia sample is z = 0.31
compared to z = 0.22 for the spectroscopically confirmed
sample. The full list of candidates is provided in
Table 4. In addition to their coordinates, we provide the
photometric light curve redshifts $z_{\rm lc}$ marginalized over
all the other parameters. The reliability of these values
is discussed in the following section.

The light curves of these candidates, as well as all of 
the other SN candidates, will be made available soon as 
part of the SDSS-II SN Survey Data Release. 

6. PHOTOMETRIC REDSHIFTS AND DISTANCES 

The light curve redshifts $z_{\rm lc}$ are determined by
marginalizing over the other four model parameters: $A_V$,
$T_{\rm max}$, $\Delta m_{15}(B)$, and $\mu$. For each SN candidate, the
posterior probability distribution function is constructed
from the MCMC output. The redshifts listed in Table 4
correspond to the median $z_{\rm lc}$ and the ±34.1% ($1\sigma$) upper
and lower limits.
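
Given MCMC samples, marginalizing over the other parameters amounts to looking only at the redshift column of the chain. The sketch below shows one way to extract the median and the ±34.1% limits with NumPy; it assumes the chain is available as a one-dimensional array of redshift samples, and the function name is hypothetical.

```python
import numpy as np


def redshift_summary(z_samples):
    """Median and asymmetric 1-sigma errors of a marginalized z posterior.

    z_samples : 1-D array of redshift values drawn by the MCMC; the other
                model parameters are marginalized over simply by ignoring
                their columns in the chain.
    Returns (median, upper_error, lower_error).
    """
    # The 15.9th, 50th, and 84.1st percentiles bracket the central 68.2%.
    lower, median, upper = np.percentile(z_samples, [50.0 - 34.1, 50.0, 50.0 + 34.1])
    return median, upper - median, median - lower


# Synthetic samples for illustration only (not survey data).
rng = np.random.default_rng(0)
z_chain = rng.normal(loc=0.30, scale=0.03, size=20000)
z_med, err_up, err_lo = redshift_summary(z_chain)
print(f"z_lc = {z_med:.3f} +{err_up:.3f} -{err_lo:.3f}")
```

For a skewed posterior the two errors differ, which is why the tables list separate upper and lower limits.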

We compare the spectroscopic redshifts $z_{\rm spec}$ with $z_{\rm lc}$
for the conf-Ia and $z_{\rm host}$-Ia samples and with the host
galaxy photometric redshifts $z_{\rm photo}$ from Oyaizu et al.
(2008) available in the SDSS DR8 database. As shown in
Figure 18, $z_{\rm lc}$ and $z_{\rm spec}$ are in agreement with $|\Delta z| < 0.02$
($\Delta z = (z_{\rm lc} - z_{\rm spec})/(1 + z_{\rm spec})$) for $z_{\rm spec} < 0.4$, but
with a small redshift-dependent bias. The RMS scatter
is $\Delta z_{\rm RMS} \sim 0.05$ below $z \sim 0.30$ and increases to
~ 0.1 at z = 0.4.

The sign and magnitude of this bias are similar to those
found by Kessler et al. (2010a), who analyzed a subset
of the higher S/N SDSS-II SN Ia light curves presented
here using both MLCS and SALT-II. Interestingly, a similar
bias is seen in their simulations. Rodney & Tonry
(2010a) do not quote a value for the bias, but they state
that a line with a slope of unity fits the $z_{\rm spec}$ vs. $z_{\rm lc}$ values
for the first-year SDSS-II SN Ia sample with $\chi^2_r = 0.98$.

We also show in Figure 19 a comparison of $z_{\rm spec}$
with the host galaxy photometric redshift $z_{\rm photo}$ from
Oyaizu et al. (2008). Here, there is a nearly constant
bias of $\Delta z \sim 0.03$ with an RMS scatter of $\Delta z_{\rm RMS} \sim$
0.05-0.10.

Fig. 19. - Comparisons of $z_{\rm spec}$ and $z_{\rm photo}$, the photometric
redshift of the SN Ia host galaxies from Oyaizu et al. (2008). There
are fewer points here than in Figure 18 because there are many
SN Ia with hosts that are below the detection limit of SDSS, and
some galaxies are classified as stars and therefore do not have $z_{\rm photo}$
values. The bottom panel shows the mean $\Delta z$ values in black circles
and the RMS as horizontal bars in bins of 0.05.

We show in Figure 20 the Hubble diagram of the 350
conf-Ia, 210 $z_{\rm host}$-Ia, and 860 photo-Ia samples. Distance
modulus residuals of the conf-Ia and $z_{\rm host}$-Ia samples
relative to a simple quadratic fit are shown in Figure 21. For
the conf-Ia sample, the scatter around the mean Hubble
relation is $\sigma_\mu = 0.13$ mag at z = 0.1 and increases
monotonically to $\sigma_\mu = 0.30$ mag at z = 0.4. The same Hubble
relation was subtracted from the $z_{\rm host}$-Ia sample, which is
shown in the bottom panel of Figure 21. There is a noticeably
larger scatter with $\sigma_\mu \sim 0.2$-$0.4$ mag in the same
redshift range. This is most likely due to contamination
from non-Ia events, which we have estimated to be at the
level of ~ 6% (approximately 1 out of 16 events in this
sample is likely to be a CC SN). The slight deviation of
the mean from zero is not statistically significant.

The Hubble diagram of the photo-Ia sample shows
extreme outliers below z ~ 0.1. All of these points are
significantly above the ΛCDM Hubble relation, and are
most likely CC SN that are mis-classified as SN Ia. In
fact, the majority of these events are classified by PSNID
as extremely-underluminous, high-extinction ($A_V > 1$)
SN Ia. Since the underlying extinction distribution of
SN Ia follows the relation $\propto e^{-A_V/\tau_V}$ with $\tau_V \sim 0.33$
(Kessler et al. 2009a), and given the smaller number of
confirmed SN Ia in the same redshift interval, it is
unlikely that all of these outlier events are underluminous,
high-extinction SN Ia. Selecting only the candidates with
$A_V < 1$ eliminates most of these outliers at the cost of a
somewhat reduced efficiency, but measurements of their
host galaxy redshifts will also significantly help
distinguish their types.

Fig. 20. - Hubble diagram of the three SN Ia samples (conf-Ia in black, $z_{\rm host}$-Ia in red, and photo-Ia in light gray and blue in the
online color version). The dark gray line represents ΛCDM. Spectroscopic redshift priors are used for the conf-Ia and $z_{\rm host}$-Ia samples.
Flat redshift priors are used for the photo-Ia sample. The redshift and distance of the photo-Ia are significantly correlated and their
uncertainties are not shown for clarity. The outliers at low z are probably due to CC SN that are mis-classified as high-$A_V$ ($A_V > 1$)
SN Ia, which are shown in blue. Note that the majority of these points are significantly away from the ΛCDM Hubble relation.

At higher redshifts, the mean Hubble relation of the
photo-Ia sample is consistent with the conf-Ia and
$z_{\rm host}$-Ia samples, but with a significantly larger scatter. Above
z ~ 0.2, the rms scatter is $\sigma_\mu \sim 0.5$-$0.7$ mag, which is
about a factor of ~ 2 larger than the scatter in the
conf-Ia and $z_{\rm host}$-Ia samples in the same redshift range.
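
The residual analysis used in this section (fit a quadratic to a reference sample, subtract it from another sample, and measure the rms scatter in redshift bins) can be sketched as follows. This is a minimal illustration with NumPy, not the analysis code used for the figures; the function name, the bin edges, and the synthetic inputs are all assumptions.

```python
import numpy as np


def hubble_residual_scatter(z, mu, z_ref, mu_ref, bin_edges):
    """Residuals relative to a quadratic fit of a reference sample.

    z, mu         : redshifts and distance moduli of the sample of interest
    z_ref, mu_ref : reference sample used to define the mean relation
    bin_edges     : increasing redshift bin edges for the scatter estimate
    Returns (residuals, per-bin rms scatter about the bin mean).
    """
    z, mu = np.asarray(z, float), np.asarray(mu, float)
    coeffs = np.polyfit(z_ref, mu_ref, deg=2)    # simple quadratic mu(z)
    residuals = mu - np.polyval(coeffs, z)
    bin_index = np.digitize(z, bin_edges) - 1    # 0-based bin of each object
    n_bins = len(bin_edges) - 1
    scatter = np.full(n_bins, np.nan)
    for i in range(n_bins):
        in_bin = residuals[bin_index == i]
        if in_bin.size > 1:
            scatter[i] = np.std(in_bin)          # rms about the bin mean
    return residuals, scatter


# Synthetic example (invented numbers): a smooth relation plus 0.2 mag noise.
rng = np.random.default_rng(1)
z_ref = np.linspace(0.05, 0.40, 200)
mu_ref = 38.0 + 12.0 * z_ref - 8.0 * z_ref**2
z_test = rng.uniform(0.05, 0.40, 500)
mu_test = 38.0 + 12.0 * z_test - 8.0 * z_test**2 + rng.normal(0.0, 0.2, 500)
_, sigma_mu = hubble_residual_scatter(z_test, mu_test, z_ref, mu_ref,
                                      bin_edges=[0.0, 0.1, 0.2, 0.3, 0.4])
print(np.round(sigma_mu, 2))
```

Measuring the scatter about each bin's own mean, rather than about zero, keeps a systematic offset from being counted as scatter.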

7. COMPARISONS WITH SIMULATIONS 

The Hubble diagram for the combined conf-Ia + $z_{\rm host}$-Ia
sample is shown in the top panel of Figure 22. The
scatter is $\sigma_\mu = 0.2$ mag at z = 0.1 and increases to
$\sigma_\mu = 0.4$ mag at z = 0.4, which is slightly larger than
the scatter of the conf-Ia sample.

This degradation is probably due to contamination by
CC SN events. To test this hypothesis, we analyzed
the sample of simulated SDSS-II SN from K10b. This
simulation corresponds to 10 three-season search campaigns,
and uses the actual seeing, photometric zeropoints, and
weather from our observing seasons. The bottom panel
in Figure 22 shows the Hubble diagram using all events
that pass the same light curve quality cuts, as well as
identical selection criteria in $P_{z,{\rm Ia}}$-$\chi^2_{z,r}$ space. Specifically,
we select SN Ia candidates using $P_{z,{\rm Ia}} > 0.9$ and
$\chi^2_{z,r} < 1.0$, which is approximately where the efficiency
and purity are equal at ~ 0.90 for this simulation. The
efficiency, purity, and figure-of-merit curves are shown
in Figure 23. The average S/N of the $z_{\rm host}$-Ia sample is
higher than that of the simulations, so we require in the
simulations S/N > 7 in at least two of the gri bands. The
purity of 90% for this selection is slightly lower than the
estimated purity of the $z_{\rm host}$-Ia sample.

The SN Ia Hubble diagram was fitted to a quadratic
function and the Hubble residuals of all candidates
classified as SN Ia are shown in the bottom panel of Figure 22.
Here, the CC SN events are shown in dark (SN Ib/c) and
light gray (SN II) points. These false positives add
scatter and a small redshift-dependent systematic shift
relative to the SN Ia distances, which are represented by
black points. The Hubble scatter around the mean for
this simulation is $\sigma_\mu \sim 0.2$-$0.4$ mag, which is similar
to that of the $z_{\rm host}$-Ia sample over the entire redshift
range. The larger scatter seen in the conf-Ia + $z_{\rm host}$-Ia
sample is, therefore, most likely due to mis-classified
CC SN as reproduced in these simulations.

This set of simulated SDSS-II SN also includes a
spectroscopic SN Ia sample selected based on our
spectroscopic follow-up strategies, and represents our conf-Ia sample.
The Hubble residual scatter of this spectroscopic sample
ranges from $\sigma_\mu \sim 0.13$ mag to $\sigma_\mu \sim 0.30$ mag in the
redshift interval 0.1 < z < 0.4, which is nearly identical to
the observed scatter of the conf-Ia sample.

8. SUMMARY AND CONCLUSIONS 
Fig. 21. - (Top) The Hubble residuals of the conf-Ia sample
relative to a simple quadratic Hubble relation. The solid line
represents the mean residual and the dashed lines represent upper
and lower rms values relative to the mean. The scatter ranges
from $\sigma_\mu \sim 0.13$ mag to $\sigma_\mu \sim 0.30$ mag in the redshift interval
0.1 < z < 0.4. (Bottom) Same except for the $z_{\rm host}$-Ia sample. The
same quadratic has been subtracted from the measured distance
modulus. The scatter here is larger and ranges from $\sigma_\mu \sim 0.2$ mag
to $\sigma_\mu \sim 0.4$ mag in the same redshift range. There is also a small
redshift-dependent offset.

We have identified 1070 photometric SN Ia candidates
from the SDSS-II SN Survey data. This sample is more
than three times larger than the spectroscopically
confirmed SN Ia sample with good light curves, and is
estimated to include ~ 91% of all SN Ia candidates
detected by the survey with a purity of ~ 94% (~ 6%
contamination). This estimate of the purity, however, is
based on a limited number of spectroscopically confirmed
CC SN, most of which are nearby, bright events and
are therefore not representative of the majority of the
contaminating events. As shown in Figure 6, the
majority of our photometric candidates have peak r-band
S/N < 10, where we have only a handful of spectroscopic
SN candidates. To obtain a better characterization of
the contaminating sources, confirmation is needed for a
much larger sample of faint CC SN that are comparable
in apparent brightness to the photo-Ia sample. As also
advocated by Richards et al. (2011), future surveys that
rely on photometric identification should obtain spectra
of SN candidates over the full range of the S/N of the
photometric candidates of interest.

The Hubble diagram with photometric classification and
host galaxy spectroscopic redshift priors shows a slight
increase in scatter over the confirmed SN Ia sample, which
is consistent with contamination by mis-classified CC SN.
There is no significant redshift-dependent offset in the
derived distances compared to the conf-Ia sample.
Simulations confirm these findings.

Fig. 22. - (Top) Same as in Figure 21 for the combined
conf-Ia + $z_{\rm host}$-Ia sample, which are labeled in black and light gray,
respectively. The same quadratic function $\mu_{\rm conf-Ia}(z)$ has been
subtracted from the measured distance modulus. The rms scatter
is slightly larger than that of the conf-Ia sample only. (Bottom)
Simulated SDSS-II SN from K10b. The black, light gray, and
dark gray points represent SN Ia, SN II, and SN Ib/c, respectively,
which pass all of the photometric SN Ia cuts ($P_{z,{\rm Ia}} > 0.9$ and
$\chi^2_{z,r} < 1.0$). The residuals shown are relative to a quadratic fit to the
simulated SN Ia sample only, whereas the rms scatter shown is for
the full sample. Note the slight redshift-dependent bias relative to
the SN Ia mean.

Photometric redshifts estimated from the multi-band
light curves are unbiased below z ~ 0.2 with an rms
dispersion of $\sigma_z \sim 0.05$. There is a redshift-dependent
bias above z ~ 0.2 where the mean redshift difference
($z_{\rm lc} - z_{\rm photo}$) is between -0.04 and -0.02. The rms
dispersion is $\sigma_z \sim 0.05$-$0.10$. The Hubble diagram of
the photo-Ia sample also exhibits outliers and
redshift-dependent biases. Although the distance and redshift
accuracies at present are not yet sufficient for cosmology,
the large sample can still be used for studies of the SN Ia
rate as a function of redshift, correlations between SN
light curves and host galaxy properties, and other studies
that do not involve joint constraints on both redshift
and distance.

We conclude that cosmology with future large-scale
SN surveys should at the minimum measure host galaxy
spectroscopic redshifts for the Hubble diagram. A subset
of the SN candidates must be observed spectroscopically
to study the photometric classification efficiency and
purity. Spectroscopy should target candidates with S/N
down to the magnitude limit where photometric
classification is expected to work. Cosmology with photometry
alone, however, requires further investigation with
realistic simulations in order to understand and characterize
the systematic biases and uncertainties, and how they
depend on the SN Ia candidate selection criteria.

Fig. 23. - Same as in Figure 10 for simulated SDSS-II SN from
K10b except with an additional constraint of S/N > 7 in two bands
to approximately match the $z_{\rm host}$-Ia sample.